Deep learning visual field global index prediction with optical coherence tomography parameters in glaucoma patients

The aim of this study was to predict three visual field (VF) global indexes, mean deviation (MD), pattern standard deviation (PSD), and visual field index (VFI), from optical coherence tomography (OCT) parameters including Bruch's membrane opening-minimum rim width (BMO-MRW) and retinal nerve fiber layer (RNFL) using a deep-learning model. Subjects consisted of 224 eyes with glaucoma suspect (GS), 245 eyes with early normal-tension glaucoma (NTG), 59 eyes with moderate-stage NTG, 36 eyes with primary angle closure glaucoma (PACG), 57 eyes with pseudoexfoliation glaucoma (PEXG), and 99 eyes with primary open angle glaucoma (POAG). A deep neural network (DNN) algorithm was developed to predict the values of the VF global indexes MD, PSD, and VFI. Model performance was evaluated with the mean absolute error (MAE). On cross validation, the MAE of the DNN model ranged from 1.9 to 2.9 dB for MD, 1.6 to 2.0 dB for PSD, and 5.0 to 7.0% for VFI. Pearson's correlation coefficients ranged from 0.76 to 0.85, 0.74 to 0.82, and 0.70 to 0.81 for MD, PSD, and VFI, respectively. Our deep-learning model might be useful for diagnosis and follow-up in the management of glaucoma, especially when immediate VF results are not available, because the VF test is subjective and requires time and dedicated space.

Previous structure-function studies have used deep learning models to predict global VF indexes, including mean deviation (MD), from OCT-derived images such as RNFL thickness maps 26,27. Other studies have predicted pointwise VF thresholds from OCT-derived image scans such as peripapillary RNFL or macular ganglion cell complex thickness maps [28][29][30][31]. However, none of these studies included any information regarding BMO-MRW. Moreover, none predicted all three VF global indexes of MD, pattern standard deviation (PSD), and visual field index (VFI) from OCT-derived images or maps. Each global index of the VF test has its own advantage; therefore, no single index can capture all aspects of VF test results. Actual values of the VF global indexes could provide a summary outline of the VF, which might be clinically useful in the management of glaucoma, including diagnosis and detection of progression.
Thus, the aim of the present retrospective cross-sectional study was to predict three VF global indexes from the OCT-derived parameters BMO-MRW and RNFL using a deep-learning model. We intended to assess the usefulness of this deep-learning model as a reference in the glaucoma clinic. It might be beneficial when immediate VF results are not available, since the VF test takes time and requires patient cooperation. We applied a deep-learning model to integrate all data available from spectral-domain OCT images, a task that might be challenging for general physicians, in order to predict VF global indexes.

Baseline characteristics of subjects
A total of 720 eyes (720 patients) with glaucoma or glaucoma suspect (GS) were included in the final analysis. Glaucoma diagnoses included early normal-tension glaucoma (NTG), moderate-stage NTG, pseudoexfoliation glaucoma (PEXG), primary angle closure glaucoma (PACG), and primary open angle glaucoma (POAG). The mean age of glaucoma patients was 53.7 ± 13.3 (mean ± standard deviation) years. Females accounted for 46%
A total of 720 eyes from 720 patients were used. Sixteen sub-parameters were used as input parameters in the dataset. Three DNN models were built and trained independently, one for each VF global index: MD, PSD, and VFI. Each model had three hidden layers and a single output layer. The exponential linear unit (ELU) was used as the activation function, and batch normalization was applied after each hidden layer. The three models shared the same structure and differed only slightly in the number of nodes and the degree of regularization. To improve model performance, we applied fivefold cross validation and tuned model hyper-parameters such as the learning rate, the degree of regularization, the number of layers, and the number of nodes in each layer. In each fold, the validation set consisted of 137 eyes (137 patients) and the training set consisted of 547 eyes (547 patients). We calculated the MAE in the validation set for each VF global index. To evaluate predictive performance, the mean absolute error (MAE), Pearson's correlation coefficient, and R² of each model were calculated; the results are shown in Table 4. An overview of the workflow of each model is illustrated in Fig. 1A. Figure 1B shows the detailed structure of the DNN model.
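The fivefold split described above can be illustrated with a minimal sketch. It is not the authors' code: the function name `k_fold_indices`, the shuffling seed, and the pool size of 684 eyes (the sum of the reported training and validation sizes, 547 + 137) are assumptions made for illustration only.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross validation.

    Hypothetical helper for illustration; the original study does not
    describe its splitting code.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder over the first folds so fold sizes differ by at most 1.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

# Assumption: 684 eyes enter cross validation (547 training + 137 validation).
folds = list(k_fold_indices(684, k=5))
fold_sizes = [(len(train), len(val)) for train, val in folds]
print(fold_sizes)
```

Under this assumption, every eye serves as validation data exactly once, and the first fold reproduces the reported sizes of 547 training and 137 validation eyes.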

Predictive performances of DNN and ML models
To evaluate the predictive performance of our DNN model, we calculated the MAE for each VF global index on the validation set. The loss curves of the DNN model for predicting the VF global indexes with increasing number of epochs are plotted in Fig. 2A-C. The loss curves for each index suggest that the performance of the DNN model was stable and robust. For comparison with the DNN model, we also trained other machine learning (ML) models: Random Forest, extreme gradient boosting (XGBoost), and support vector machine (SVM) with a radial basis function (RBF) kernel. The MAE of VFI ranged from 5.0 to 7.0% for the DNN model (6.3-6.9% for Random Forest, 6.5-7.4% for XGBoost, and 6.5-8.0% for SVM with RBF kernel). These results are summarized in Table 3.

Comparison of actual and DNN predicted values of VF global indexes
Statistical analysis was performed to compare the actual values of each VF global index with the values predicted by the DNN model. Figure 2G-I shows scatter plots of predicted and actual values of the three indexes (MD, PSD, and VFI) in the dataset. Pearson's correlation coefficient and R² were also calculated. Between predicted and actual values of MD in the fivefold cross validation, Pearson's correlation coefficient ranged from 0.76 to 0.85 (p < 0.001). For PSD, Pearson's correlation coefficient ranged from 0.74 to 0.82 (p < 0.001). For VFI, Pearson's correlation coefficient ranged from 0.70 to 0.81 (p < 0.001). In addition, R² ranges were 0.59-0.65, 0.58-0.66, and 0.58-0.65 for MD, VFI, and PSD, respectively. Statistical results of the DNN on fivefold cross validation are summarized in Table 4.
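For readers who wish to reproduce these agreement metrics on their own data, the following is a minimal sketch of MAE, Pearson's correlation coefficient, and R². The text does not state which definition of R² was used; the coefficient of determination (1 − SS_res/SS_tot) is assumed here. The function names and the four MD values are hypothetical and are not study data.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def pearson_r(actual, predicted):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

def r_squared(actual, predicted):
    """Coefficient of determination (assumed definition): 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical MD values in dB, for illustration only.
actual_md = [-1.0, -3.0, -6.0, -10.0]
predicted_md = [-2.0, -3.0, -5.0, -9.0]

mae = mean_absolute_error(actual_md, predicted_md)
r = pearson_r(actual_md, predicted_md)
r2 = r_squared(actual_md, predicted_md)
print(mae, r, r2)
```

Note that under this definition R² is not simply the square of Pearson's r: it also penalizes systematic bias in the predictions, so it can be lower than r² for the same data.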

Predictive performances of DNN model according to OCT-derived parameters
We evaluated the performance of the DNN model in predicting a VF index (MD) for each set of OCT-based input parameters: BMO-MRW alone, RNFL alone, and BMO-MRW and RNFL combined. The mean absolute error (MAE) of the DNN model was 2.72 dB with BMO-MRW alone and 2.87 dB with RNFL alone. The DNN model based on BMO-MRW and RNFL combined achieved the smallest MAE, 2.28 dB.

Deep learning predictive performance analysis according to glaucoma severity
To evaluate the predictive performance of the DNN model according to glaucoma severity, we measured the absolute error between the actual and predicted values of MD for each eye. Figure 3A shows a scatter plot of the absolute error against the actual MD value of each eye.

Discussion
To our knowledge, the present study is the first to predict all three VF global indexes (MD, PSD, and VFI) from the OCT-derived parameters BMO-MRW, a new parameter, and RNFL using a deep learning model. We found that our DNN model performed well in predicting VF global indexes compared with other machine-learning models. For all three indexes, the DNN model showed the best performance. We also found a strong correlation between the predicted and actual values of each index.
BMO-MRW obtained from spectral-domain OCT has become increasingly available to clinicians. It provides some advantages over the previous standard for morphometric optic nerve head analysis, confocal scanning laser tomographic measurements [21][22][23]. Compared to existing ophthalmic examinations, BMO-MRW allows for a more precise geometric assessment of the neuroretinal rim (NRR) [15][16][17]20. It has been shown that BMO-MRW accurately reflects the amount of neural tissue present in the optic nerve 32. Our previous study reported a high diagnostic performance in discriminating early normal-tension glaucoma (NTG) from glaucoma suspect (GS) (AUC, 0.966) based on a deep learning model using the OCT parameters BMO-MRW, peripapillary RNFL, and color classification of RNFL 25. Interestingly, BMO-MRW as a single parameter provided a higher diagnostic performance (AUC: 0.959) than RNFL alone (AUC: 0.914) and RNFL with its color code classification (AUC: 0.934) 25. Moreover, BMO-MRW alone showed diagnostic performance similar to that of all three OCT parameters combined. These results suggest that BMO-based optic disc assessment might evaluate different aspects of the optic disc better than conventional disc assessments in the diagnosis of glaucoma. These findings suggest that BMO-MRW is clinically useful in the diagnosis of glaucoma. It might be even better than conventional RNFL.
Based on these findings, integrating the assessment of BMO-MRW and RNFL is beneficial for better diagnosis of glaucoma. However, integrating these two different parameters is complex and challenging for human beings, including general physicians other than glaucoma specialists. This is where the latest artificial intelligence technology can be useful. Recent reports indicate that machine-learning classifiers can aid clinical practice and efficiently enhance glaucoma diagnosis for general ophthalmologists in the primary eye care setting when there is a lack of glaucoma specialists 33. The deep learning model can provide rapid diagnostic results in the clinic after ophthalmic examination data are entered, without the need for a multi-day analysis. Ultimately, the decision to treat glaucoma is up to the physician, but the deep learning model can suggest a preliminary diagnosis for reference 34. Moreover, the DNN diagnostic model is more cost-effective and clinically easier to access than other imaging-based CNN diagnostic programs, which require costly equipment such as workstations with GPUs and take several days to produce results.
A previous study by Park et al. 29 predicted VF regional thresholds with deep learning based on Inception V3 using combined OCT images of macular ganglion cell-inner plexiform layer (mGCIPL) and peripapillary RNFL (pRNFL) thickness maps. They conducted pointwise estimation of the VF for a regional analysis. With the deep learning method, the root mean squared error (RMSE) of the entire VF area for all patients was 4.79 ± 2.56 dB (mean ± standard deviation). In our study, we estimated global VF indexes. The MAE of MD was 2.57 ± 0.33 dB. Our results showed a lower error, suggesting better results in predicting the entire VF threshold. Heelings et al. 31 conducted a study to predict VF MD and 52 threshold values based on a customized CNN model with Xception using peripapillary RNFL maps and scanning laser ophthalmoscopy en face images. The MAE for MD estimation with their deep learning model was 2.89 dB (range, 2.50-3.30 dB). In our study, the MAE for MD prediction was 2.57 dB (range, 1.95-2.87 dB). Therefore, the present study showed a lower MAE, indicating better results for predicting the entire VF threshold. Christopher et al. 26 developed a deep learning system based on ResNet50 to predict MD, PSD, and mean VF sectoral pattern deviation (PD) using image data of the RNFL thickness map, RNFL en face image, and confocal scanning laser ophthalmoscopy image. In MD estimation, the deep learning model with the RNFL en face image achieved the highest performance, with an R² of 0.70 (range, 0.64-0.74) and an MAE of 2.5 dB (range, 2.3-2.7 dB). In PSD estimation, R² was 0.61 (range, 0.55-0.66) and MAE was 1.5 dB (range, 1.4-1.6 dB). Our deep learning model, which utilized combined parameters of RNFL and BMO-MRW, demonstrated performance similar to that of these previous studies. It could also predict an additional VF global index, VFI. The results of our study were highly comparable to those of previous research and are therefore meaningful.
Yu et al. used a 3D CNN model to estimate the VF global indexes MD and VFI, but not all three indexes, from combined macula and optic disc OCT scans in healthy, glaucoma suspect, and glaucoma patients 27. Each global index of the VF test has its own advantage; thus, no single index can capture all aspects of the entire VF results. For example, MD is useful for estimating the overall stage of glaucoma. On the other hand, PSD reflects focal VF defects in the early stage of glaucoma, which is beneficial in the diagnosis of early glaucoma.
In the study by Yu et al., using a deep learning model based on macular and optic nerve head scans, the MAE was 1.57 dB for MD and 2.7% for VFI. Yu et al. showed strong results with a larger number of images. However, their study included data from multiple visits of the same patient to obtain a larger number of images. We used single-visit data from each subject, which might be more independent and reliable. Moreover, we used numerical data extracted from OCT with a lighter and more cost-effective model to predict VF global indexes. Our results were quite comparable to those of Yu et al., who used OCT images with a more complicated model. Results for VFI seemed to be better in the study by Yu et al. (2.7% for VFI). However, considering that VFI is expressed as a percentage, the results of our study were still reasonably good. The VFI reflects RGC loss and function as a percentage, with central points weighted more heavily 35. It is expressed as the percentage of remaining visual function. It is a reliable index on which glaucomatous visual field severity staging can be based. VFI can also be used to calculate the rate of progression, as shown in the trend-based glaucoma progression analysis of the Humphrey Field Analyzer software 36. Although VFI is important in the management of glaucoma, previous artificial intelligence (AI) studies that predicted this global index with deep learning methods are rare. Most previous studies have focused on predicting MD as a global index from different images obtained with OCT or HRT devices [26][27][28][29][30][31]. Our study is also meaningful in that we predicted VFI as a global index from extracted OCT data, which has not been reported before in the field of AI using deep learning methods.
The results of the current study are clinically meaningful in that they provide summary numbers of the functional VF test from the structural OCT test. The OCT test is objective and offers quantitative values of optic nerve head parameters. In contrast, the VF test requires patient cooperation, a relatively long time, and a designated space. Quite frequently, VF test results are not available at the time of clinical practice. Because the VF test also requires cognitive ability and motor reaction, it cannot be performed correctly in elderly patients, those with dementia or stroke, and those with motor disability. Moreover, in some clinics, VF tests need an appointment. They cannot be done at the first visit because previously scheduled VF tests are being performed at that time. If the patient cannot come back within a short time, the VF test can be delayed for a very long time. In such cases, a correct diagnosis of glaucoma or a decision about disease progression is difficult to make. If summary results of the VF test could be predicted from the OCT test without actually performing the VF test, it could be clinically very helpful in the management of glaucoma. Notably, in our deep learning model for predicting VF global indexes, prediction performance based on the MAE was best in the early stage of glaucoma, as shown in Fig. 3A. Patients with early-stage glaucoma or glaucoma suspects usually visit a glaucoma clinic to be diagnosed for the first time, and in these cases VF test results are necessary. Our relatively quick DNN model may also be useful in these situations, which frequently occur in clinics.
NTG accounts for the majority (76.3%) of patients with POAG in Asian populations, as reported by previous population-based studies 37. Thus, information regarding NTG is clinically important for Asians. This applies to Asian countries and to other countries with a substantial proportion of Asian population. However, previous deep-learning studies rarely included NTG. It is difficult to find studies that included NTG data or that even classified NTG. Because previous deep-learning studies including NTG data are scarce, the current study might be a meaningful addition to the literature, providing additional information for future deep-learning studies in the field of glaucoma.
The current study had several limitations. First, there are potential limitations owing to its retrospective design. We included only those who had undergone both RNFL and BMO-MRW tests with acceptable image quality. In addition, only those who had reliable VF tests were included. The impact of this subject selection on our results remains unclear. Second, the study was conducted at a referral university hospital within the province using a hospital-based design rather than a population-based approach.
The individuals included in the study may not be a fully representative sample of the general population. Additionally, this study included only Korean patients. Thus, the results of our study, including those for NTG, might not be applicable to other ethnic groups. Third, the sample size of this study is relatively small. Although 720 subjects with either glaucoma or GS were included, this number might be insufficient to train or test a model that predicts a single test result from single-device data. Other studies with larger numbers used both eyes from multiple visits. However, we used only one randomly selected eye per person from a single visit. Our data might therefore be more independent and more reliable than those of previous studies. If we had included both eyes from multiple visits, the number of data points could have been much larger, for example, six times as large. Finally, the analysis of OCT images with a deep neural network (DNN) in this study was based on numerical data extracted from the images rather than on the images themselves. However, it is still meaningful in that clinicians can use deep-learning models with free open-source tools to obtain prompt results and get aid in the management of glaucoma. This approach is more economically feasible than using convolutional neural networks (ConvNets) for image analysis, which can be costly when high accuracy is required. In future studies, we might consider developing our own program that employs ConvNets for direct OCT-image analysis to aid preliminary diagnosis in clinical practice with accurate performance.
In conclusion, our DNN model showed high performance in predicting the VF global indexes MD, PSD, and VFI based on the OCT-derived parameters BMO-MRW, a new parameter, and RNFL. Prediction based on VFI was the highest, followed by that on MD and PSD using our DNN model in GS and glaucoma patients. Our DNN model might be beneficial in clinical practice in the management of glaucoma, including diagnosis and monitoring of progression. Given that our DNN model provides prompt outputs, it has the potential to be particularly valuable in settings where no glaucoma specialists are available, such as primary eye care. Nonetheless, a more conclusive determination would require a larger, multi-center study with a substantial patient cohort.

Ethics statement
This retrospective, observational, cross-sectional study was conducted in accordance with the tenets of the Declaration of Helsinki. It was approved by the Institutional Review Board (IRB) of Gyeongsang National University Changwon Hospital, Gyeongsang National University School of Medicine. The requirement for informed consent was waived by the IRB of Gyeongsang National University Changwon Hospital due to the retrospective nature of the study.

Subjects
Among 1487 patients with glaucoma and glaucoma suspects who were evaluated between February 2016 and December 2021 in a glaucoma clinic at Gyeongsang National University Changwon Hospital, a total of 720 eyes (720 subjects) were included. Glaucoma diagnoses included early NTG, PACG, PEXG, POAG, and GS. Subjects consisted of 224 eyes with GS, 245 eyes with early NTG, 59 eyes with moderate-stage NTG, 36 eyes with PACG, 57 eyes with PEXG, and 99 eyes with POAG. The study included only those participants who met the diagnostic criteria below and demonstrated reliable results for both BMO-MRW and RNFL.
Glaucoma was diagnosed by a single glaucoma specialist (H-k Cho) applying consistent criteria. NTG was diagnosed in patients with an IOP ≤ 21 mmHg without treatment who demonstrated glaucomatous optic disc injury and corresponding VF loss, an open angle on gonioscopic inspection, and no underlying cause of optic disc injury other than glaucoma 38. Early NTG was defined as VF test results of MD > −6.0 dB. PACG was defined as eyes with a shallow anterior chamber (appositional contact between the peripheral iris and the trabecular meshwork (TM) > 270 degrees on gonioscopy) that showed glaucomatous optic disc damage (decline of the NRR with a vertical cup-to-disc ratio of 0.7 or an asymmetry between eyes of 0.2, or notching attributable to glaucoma) and corresponding visual field defects 39. The criteria for PEX glaucoma included the observation of PEX material at the margin of the pupil and on the anterior lens capsule after maximal pupil dilatation, along with a baseline IOP of at least 22 mmHg, glaucomatous optic nerve head damage, visual field loss consistent with the optic disc injury, and the absence of other conditions causing secondary glaucoma 40. POAG was defined as a baseline IOP of more than 21 mmHg prior to treatment with findings of glaucomatous optic nerve head injury and corresponding VF loss, an open angle on gonioscopic inspection, and no underlying cause of optic nerve head injury besides glaucoma 1.
The exclusion criteria were as follows: low-quality image scans resulting from eyelid blinking or poor fixation; history of optic neuropathies aside from glaucoma or an acute angle-closure crisis that could affect the thickness of the RNFL or BMO-MRW (e.g., optic neuritis, acute ischemic optic neuritis); history of any intraocular surgery except for uneventful phacoemulsification; and retinal disease associated with retinal swelling or edema and subsequent RNFL or BMO-MRW swelling. Preperimetric glaucoma was excluded from the current study. Subjects were not excluded on the basis of axial length, refractive error, or optic disc size.

Optical coherence tomography
Spectral-domain OCT imaging was performed using the Glaucoma Module Premium Edition. Twenty-four radial B-scans were acquired to analyze BMO-MRW. Among three scan circle diameters (3.5, 4.1, and 4.7 mm), the 3.5 mm scan circle was chosen for peripapillary RNFL thickness measurement. Only images that were correctly centered and accurately segmented, with quality scores ≥ 20, were selected for this study. Images taken with OCT were aligned along the FoBMO axis, an individual-specific axis connecting the center of the BMO and the fovea of the macula. Employing this FoBMO axis could enable more correct

Figure 1. (A) Workflow of this study. Input data extracted from OCT images are used to predict VF indexes (MD, PSD, VFI) through a DNN model. The detailed structure of the dashed-line box marked with a red star is described in (B). (B) Detailed structure of the DNN model. The number above each box represents the number of nodes in the prior layer. The number below each box represents the number of nodes in the present layer. OCT = optical coherence tomography; MD = mean deviation; PSD = pattern standard deviation; VFI = visual field index; DNN = deep neural network.

Figure 2. (A-C) Loss curves of the DNN model for predicting the VF global indexes MD, PSD, and VFI. The blue line is for the training set and the orange line is for the validation set. The x axis is the epoch and the y axis is the value of each loss function. (D-F) Comparison of MAE for predicting VF global indexes on fivefold cross validation. In each panel, the blue, orange, green, and red bars represent the MAE of XGBoost, Random Forest, SVM with RBF kernel, and the DNN model, respectively. The black bar on each bar indicates the standard deviation over the fivefold cross validation. The x axis is the MAE value. (G-I) Scatter plots of deep learning predicted and actual values of the three indexes (MD, PSD, and VFI) in the dataset. Blue, orange, and green points indicate the training set, validation set, and test set, respectively. The x axis is the value predicted by the DNN model and the y axis is the actual value. VF = visual field; MD = mean deviation; PSD = pattern standard deviation; VFI = visual field index; MAE = mean absolute error; DNN = deep neural network; XGBoost = extreme gradient boosting; SVM = support vector machine; RBF = radial basis function.

Workflow of deep learning model for predicting visual field global indexes
We aimed to estimate three VF global indexes, MD, PSD, and VFI, from parameters of BMO-MRW and RNFL based on deep learning. The main workflow of our deep learning model for predicting visual field indexes is as follows. First, we extracted numerical parameters of BMO-MRW and RNFL from OCT scan images using Heidelberg licensed software and included the age of patients in the dataset to train and test the deep neural network (DNN) model.
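As a rough illustration of the kind of network described in this paper (sixteen numerical inputs, three hidden layers with ELU activation and batch normalization, and a single regression output), the following sketch implements an untrained forward pass in plain Python. It is not the authors' implementation: the hidden-layer widths (64, 32, 16), the weight initialization, the placement of batch normalization after the activation, and the class names `HiddenBlock` and `VFIndexRegressor` are all assumptions, since the node counts shown in Fig. 1B and the training code are not reproduced in the text.

```python
import math
import random

def elu(x, alpha=1.0):
    """Exponential linear unit: identity for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

class HiddenBlock:
    """Dense layer -> ELU -> batch normalization (inference form).

    The ordering is an assumption; the paper only states that batch
    normalization was applied after each hidden layer.
    """
    def __init__(self, n_in, n_out, rng):
        scale = math.sqrt(2.0 / n_in)
        self.weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                        for _ in range(n_out)]
        self.bias = [0.0] * n_out
        # Batch-norm parameters; in a trained model these would be learned/estimated.
        self.running_mean = [0.0] * n_out
        self.running_var = [1.0] * n_out
        self.gamma = [1.0] * n_out
        self.beta = [0.0] * n_out
        self.eps = 1e-5

    def forward(self, x):
        out = []
        for j, row in enumerate(self.weights):
            z = sum(w * xi for w, xi in zip(row, x)) + self.bias[j]
            a = elu(z)
            a_hat = (a - self.running_mean[j]) / math.sqrt(self.running_var[j] + self.eps)
            out.append(self.gamma[j] * a_hat + self.beta[j])
        return out

class VFIndexRegressor:
    """One model per VF global index (MD, PSD, or VFI); widths are hypothetical."""
    def __init__(self, n_inputs=16, hidden=(64, 32, 16), seed=0):
        rng = random.Random(seed)
        sizes = (n_inputs,) + tuple(hidden)
        self.blocks = [HiddenBlock(a, b, rng) for a, b in zip(sizes, sizes[1:])]
        out_scale = math.sqrt(1.0 / sizes[-1])
        self.out_weights = [rng.gauss(0.0, out_scale) for _ in range(sizes[-1])]
        self.out_bias = 0.0

    def predict(self, features):
        h = list(features)
        for block in self.blocks:
            h = block.forward(h)
        # Single linear output node: the predicted index value.
        return sum(w * hi for w, hi in zip(self.out_weights, h)) + self.out_bias

model = VFIndexRegressor()
prediction = model.predict([0.5] * 16)  # 16 placeholder inputs, not real OCT data
print(prediction)
```

Because the weights are random, the output has no clinical meaning; the sketch only shows how three independent models of this shape would map a vector of extracted OCT parameters to one scalar index each.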
The mean absolute error (MAE) of the DNN model was 2.19 ± 1.84 dB in the test set, as shown in Fig. 3B. The prediction performance for each glaucoma severity grade in the test set was as follows. The MAE was 1.76 ± 1.31 dB for unaffected controls (NN; MD ≥ 0.0) and 2.05 ± 1.98 dB for the mild glaucoma grade (G1; −6.0 < MD < 0.0). The MAE was 2.17 ± 0.87 dB for the moderate glaucoma grade (G2; −12.0 < MD ≤ −6.0) and 3.58 ± 2.75 dB for the severe glaucoma grade (G3; MD ≤ −12.0). Notably, the MAE for the early stage of glaucoma was the smallest among all the stages of glaucoma.
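The severity-stratified analysis can be sketched as follows, using the MD cut-offs given above (NN: MD ≥ 0.0; G1: −6.0 < MD < 0.0; G2: −12.0 < MD ≤ −6.0; G3: MD ≤ −12.0). The function names and the five example eyes are hypothetical and are not study data.

```python
from collections import defaultdict

def severity_grade(md):
    """Map an actual MD value (dB) to the severity grade used in the text."""
    if md >= 0.0:
        return "NN"   # unaffected control
    if md > -6.0:
        return "G1"   # mild glaucoma
    if md > -12.0:
        return "G2"   # moderate glaucoma
    return "G3"       # severe glaucoma

def mae_by_grade(actual_md, predicted_md):
    """Group absolute errors by the grade of the actual MD and average them."""
    errors = defaultdict(list)
    for actual, predicted in zip(actual_md, predicted_md):
        errors[severity_grade(actual)].append(abs(actual - predicted))
    return {grade: sum(errs) / len(errs) for grade, errs in errors.items()}

# Hypothetical MD values in dB, for illustration only.
actual_md = [0.5, -2.0, -4.0, -8.0, -15.0]
predicted_md = [-0.5, -3.0, -2.0, -9.5, -11.0]

grade_mae = mae_by_grade(actual_md, predicted_md)
print(grade_mae)
```

Grading is based on the actual MD rather than the predicted MD, so each eye is assigned to the stage it truly belongs to before its prediction error is averaged.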

Table 3. MAE of the DNN model compared with other machine learning algorithms. MAE = mean absolute error; SD = standard deviation; MD = mean deviation; PSD = pattern standard deviation; VFI = visual field index; SVM = support vector machine; RF = random forest; XGB = extreme gradient boosting; DNN = deep neural network.